Abstract. In this paper we study the relationship between a query and a
search engine by exploring adaptive properties based on a simple search
engine. We use set theory and the words and terms of a query to define
the singleton space of event in a search engine model, and then establish
the inclusion of one singleton in another.
1 Introduction

A search engine on the World Wide Web (the Web, for short) is extremely
important for helping users find relevant information. Search engines offer
features for tasks and subtasks that rely, directly or indirectly, on
techniques such as indexing, filters, hubs, page rank, and hits [1], but to
access any information on the Web users need to formulate a query about the
required information. The query has thus become the leading paradigm for
finding information, whereby information retrieval (IR) is concerned with
answering an information need as accurately as possible. However, users often
lack an understanding of how to formulate a query. Moreover, almost no search
engine provides a function for finding special cases such as entities or
actors. Therefore, the major challenge in information access is to provide
rich and trusted information. This paper aims to derive some adaptive
properties of the relation between a search engine and a query.



2 Basic Concept and Motivation 

Suppose objects (entities or attributes) are given literally, like the literal
text "Social Network"; then the meaning of an object based on words is
represented by the literal object itself. To make this precise, we first
define a word $w$ as the basic unit of discrete data, an item from a
vocabulary indexed by $\{1, \ldots, K\}$, where $w_k = 1$ if $k \in K$ and
$w_k = 0$ otherwise [2]. We then define some instances related to words.
Definition 1. A term $t_k$ consists of at least one word or a set of words in
a pattern, or $t_k = (w_1 w_2 \ldots w_l)$, $l \le k$, where $k$ is the number
of parameters representing the word $w$, $l$ is the number of tokens
(vocabularies) in $t_k$, and $|t_k| = k$ is the size of $t_k$. ■

We define a simple search engine as follows. 

Definition 2. Let the set of web pages indexed by a search engine be $\Omega$,
i.e., a set containing the ordered pairs of the terms $t_{k_i}$ and the web
pages $\omega_{k_j}$, or $(t_{k_i}, \omega_{k_j})$, $i = 1, \ldots, I$,
$j = 1, \ldots, J$. The relation table that consists of the two columns $t_k$
and $\omega_k$ is a representation of $(t_{k_i}, \omega_{k_j})$, where
$\Omega_k = \{(t_k, \omega_k)_{ij}\} \subseteq \Omega$ or
$\Omega_k = \{\omega_{k_1}, \ldots, \omega_{k_j}\}$. The cardinality of
$\Omega$ is denoted by $|\Omega|$. ■
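
Definition 2 can be illustrated with a minimal sketch, assuming a toy corpus
and assuming that a term is any run of up to two consecutive words on a page.
The page identifiers, page texts, and the helper names `build_omega` and
`pages_for` are hypothetical and serve only to show the relation between terms
and web pages as a set of ordered pairs.

```python
# Hypothetical toy corpus: page identifiers mapped to their text.
PAGES = {
    "p1": "social network analysis of the web",
    "p2": "search engine and social network",
    "p3": "information retrieval on the web",
}


def build_omega(pages, max_len=2):
    """Return the relation Omega as a set of (term, page_id) pairs.

    A term is modelled as a tuple of up to max_len consecutive words
    (an assumption made for this sketch only).
    """
    omega = set()
    for page_id, text in pages.items():
        words = text.split()
        for n in range(1, max_len + 1):
            for i in range(len(words) - n + 1):
                omega.add((tuple(words[i:i + n]), page_id))
    return omega


def pages_for(omega, term):
    """Return Omega_k, the set of pages paired with the given term."""
    return {page_id for t, page_id in omega if t == term}


omega = build_omega(PAGES)
social_network_pages = pages_for(omega, ("social", "network"))
web_pages = pages_for(omega, ("web",))
```

Storing the relation as a set of pairs mirrors the two-column relation table
of the definition; a production index would more likely use an inverted index
keyed by term.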

In Definition 2 we assume that $\Omega$ is made of a set of indexed terms
$t_{k_i}$, which we call a space of terms. The web pages and queries are thus
represented as vectors in $\Omega$, which is also a space of events whose
semantics is that of a multidimensional space. Therefore, a term $t_k$ is
represented as a vector of web pages, i.e., the meaning of a term is the set
of $\omega_k \in \Omega$ in which $t_k$ occurs.
Let $q$ be a query; then $t_k \in q$ for $t_k = (w_1 w_2 \ldots w_k)$. In
terms of logical implication, a web page is relevant to a query if it implies
the query, that is, if $\omega \Rightarrow q$ is true, or
$\omega \Rightarrow t_k$ is true $\forall \omega \in \Omega : (\omega(t_k) = 1)$.
For the $2^k - 2$ other subsets
$\{t^{2^k-2}\} \subset \{w_1, w_2, \ldots, w_k\} = t_k$, $\omega \Rightarrow q$
is also true $\forall \{t^{2^k-2}\} \ne \emptyset$. Thus the degree of
$\omega \Rightarrow q$ is measured by $P(\omega \Rightarrow q)$, and the
probability of $t_k$ among the power subsets of $\{w_1, w_2, \ldots, w_k\}$ is

    $P(t_k) = \frac{1}{2^k - 1}$, $t_k = (w_1 w_2 \ldots w_k)$.    (1)
Therefore there is a uniform probability mass function for $\Omega$,

    $P : \Omega \to [0, 1]$    (2)

where $\sum_{\omega \in \Omega} P(\omega) = 1$.
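
The uniform assumption can be checked numerically with a short sketch. It
assumes that Eq. (1) assigns the same probability $1/(2^k - 1)$ to each
nonempty subset of a $k$-word term; the function names and the three
placeholder words are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def nonempty_subsets(words):
    """All 2**k - 1 nonempty subsets of a k-word term, smallest first."""
    return [
        frozenset(combo)
        for size in range(1, len(words) + 1)
        for combo in combinations(words, size)
    ]


def uniform_term_probability(words):
    """Probability of one subset, assuming the uniform form 1 / (2**k - 1)."""
    return Fraction(1, 2 ** len(words) - 1)


words = ("w1", "w2", "w3")
subsets = nonempty_subsets(words)
p = uniform_term_probability(words)
total = p * len(subsets)  # the masses of all nonempty subsets together
```

Exact fractions are used instead of floats so that the normalization to one
can be compared without rounding error.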

Definition 3. Let $t_x$ be a search term, and $t_x \in S$, where $S$ is the
set of singleton search terms of a search engine. A vector space
$\Omega_x \subseteq \Omega$ is a singleton search engine event (singleton
space of event) of web pages that contain an occurrence of
$t_x \in \omega_x$. The cardinality of $\Omega_x$ is denoted by
$|\Omega_x|$. ■



In the singleton space of event, $\Omega_x \subseteq \Omega$ if
$\omega \Rightarrow t_x$ is true, or

    $\Omega_x(t_x) = \begin{cases} 1 & \text{if } t_x \text{ is true at } \omega \in \Omega, \\ 0 & \text{otherwise,} \end{cases}$    (3)

and the cardinality of $\Omega_x$ is
$|\Omega_x| = \sum_{\Omega}(\Omega_x(t_x) = 1)$. This means that every web
page indexed by the search engine contains at least one occurrence of a
search term, so we can measure the degree of uncertainty of
$\omega \Rightarrow t_x$ relative to $\omega \Rightarrow q$ by

    $P(\Omega_x) = P(\Omega_x(t_x) = 1) = \frac{|\Omega_x|}{|\Omega|}$.    (4)
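
The indicator and the resulting probability can be sketched as follows. The
sketch assumes that a term is "true at" a page when every word of the term
occurs on that page, and that the probability is the ratio of the singleton
cardinality to the number of indexed pages. The corpus and the function names
are hypothetical.

```python
from fractions import Fraction

# Hypothetical indexed pages (Omega); contents are illustrative only.
OMEGA = {
    "p1": "social network analysis",
    "p2": "search engine for the social web",
    "p3": "network of web pages",
    "p4": "query and search engine",
}


def indicator(term_words, page_text):
    """Return 1 if every word of the term occurs on the page, else 0."""
    page_words = set(page_text.split())
    return 1 if all(word in page_words for word in term_words) else 0


def singleton_event(term_words, omega):
    """Omega_x: identifiers of the pages on which the indicator equals 1."""
    return {
        page_id
        for page_id, text in omega.items()
        if indicator(term_words, text) == 1
    }


def probability(term_words, omega):
    """Ratio |Omega_x| / |Omega| under the uniform mass assumption."""
    return Fraction(len(singleton_event(term_words, omega)), len(omega))


omega_x = singleton_event(("search", "engine"), OMEGA)
p_x = probability(("search", "engine"), OMEGA)
```

In practice $|\Omega_x|$ is the hit count reported by a search engine and
$|\Omega|$ is its index size, neither of which is observable exactly, so the
toy corpus stands in for both.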
For example, let the search term be a person name: $x$ = "Mahyuddin Khairuddin
Matyuso Nasution"; then $\{t_x\} = \{w_1, w_2, w_3, w_4\}$ = {"Mahyuddin",
"Khairuddin", "Matyuso", "Nasution"}. At the time of the experiment, a Yahoo!
search for "Mahyuddin Khairuddin Matyuso Nasution" returned $|\Omega_x| = ?$
hits, or $|\Omega_x| = 3{,}440$ for "Mahyuddin K. M. Nasution", and the numbers
of hits for $w_i$, $i = 1, 2, 3, 4$, were {54,300; 3,187,000; 0; 275,000}. The
vector space of $t_x$ is
$\{t^{2^k-1}\} = \{\{w_1\}, \{w_2\}, \{w_3\}, \{w_4\}, \{w_1, w_2\},
\{w_1, w_3\}, \{w_1, w_4\}, \{w_2, w_3\}, \{w_2, w_4\}, \{w_3, w_4\},
\{w_1, w_2, w_3\}, \{w_1, w_2, w_4\}, \{w_1, w_3, w_4\}, \{w_2, w_3, w_4\},
\{w_1, w_2, w_3, w_4\}\}$. We also have $|\Omega_{x_p}| = 55$ from the Yahoo!
search engine for $t_x$ with its pattern as the meaning core of $\Omega_x$,

    $|\Omega_{x_p}| = \sum_{\Omega}(\omega_x \Rightarrow t_x) \le |\Omega_x|$,    (5)

where $\sum_{\Omega}(\omega_x \Rightarrow t_x)$ is the number of web pages
containing $t_x$ with exactly that pattern. The singleton space of event
captures, in a particular sense, all background knowledge about the search
terms concerned that is available on the Web; geometrically, it is a semantic
representation of meaning.
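
The difference between pages that merely contain the words of a term and pages
that contain the exact pattern can be sketched on a toy corpus. The three page
texts below are invented for illustration and are not the Yahoo! results
quoted above; the function names are hypothetical.

```python
def contains_all_words(term_words, page_text):
    """True if every word of the term occurs somewhere on the page."""
    page_words = page_text.split()
    return all(word in page_words for word in term_words)


def contains_exact_pattern(term_words, page_text):
    """True if the words of the term occur consecutively, in order."""
    page_words = page_text.split()
    n = len(term_words)
    return any(
        tuple(page_words[i:i + n]) == tuple(term_words)
        for i in range(len(page_words) - n + 1)
    )


# Invented pages, lower-cased for simplicity.
PAGES = [
    "mahyuddin khairuddin matyuso nasution wrote this page",
    "nasution mahyuddin and khairuddin matyuso are listed here",
    "mahyuddin nasution studies search engines",
]

term = ("mahyuddin", "khairuddin", "matyuso", "nasution")
omega_x = [page for page in PAGES if contains_all_words(term, page)]
omega_xp = [page for page in omega_x if contains_exact_pattern(term, page)]
```

Because the exact-pattern pages are filtered from the pages that contain all
the words, the pattern count can never exceed the singleton count in this
sketch.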

Similarly, for two search terms $t_x$ and $t_y$ in different queries, we have

    $\Omega_x \cap \Omega_y = (\Omega_x(t_y) = 0) \wedge (\Omega_y(t_x) = 0) = \emptyset$,    (6)

i.e., any two singleton spaces of event are independent.

Problem 1. Let $t_x$ and $t_y$ be two different search terms, $t_x \ne t_y$.
Let $\Omega_x$ and $\Omega_y$ be the singleton search engine events of $t_x$
and $t_y$, respectively, and $|t_y| < |t_x|$ or $\forall w_i \in t_y$,
$w_i \in t_x$, $\exists w_j \in t_x$, $w_j \notin t_y$; then

    $|\Omega_x| \ne |\Omega_x| + |\Omega_y|$    (7)

where $\Omega_x, \Omega_y \subseteq \Omega$.

Problem 1 is a property of the relation between any search engine and any
query in a heterogeneous environment such as the Web, where the information
about any object is scattered in various places. Bias therefore exists in
almost all measurements.

3 The Adaptive Properties in Search Engine 

Numerous studies in natural language processing (NLP) and the Semantic Web
utilize a search engine, mainly to obtain a set of documents that include a
given query and to get statistical information about an object, such as the
hit count of an entity name. However, bringing NLP and the Semantic Web to
life, so that information processing services provide knowledge (for example
ontology construction, knowledge extraction, question answering, and other
purposes [3]), needs more effort.

We derive some properties in order to learn efficient ways to access and
extract information from the Web. The purpose of this construction is to
eliminate the bias by developing an adaptive model of the relation between a
search engine and the search terms.
Lemma 1. Let $t_x$ and $t_y$ be search terms. If $t_x \ne t_y$,
$t_x \cap t_y \ne \emptyset$ and $|t_y| < |t_x|$, then the singleton search
engine event of $t_x$ and $t_y$ is $\Omega_x = \Omega_x \cup \Omega_y$ or

    $|\Omega_x| = |\Omega_x| + |\Omega_y|$,    (8)

where $\Omega_x, \Omega_y \subseteq \Omega$.

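Proof. For all search terms $t_x$ and $t_y$ where $t_x \ne t_y$,
$t_x \cap t_y \ne \emptyset$ and $|t_y| < |t_x|$, by Definition 1 and
Definition 2 we have $\forall w_y \in t_y$, $w_y \in t_x$,
$\exists w_x \in t_x$, $w_x \notin t_y$ $\Rightarrow$
$\forall w_y \in \omega_y$, $w_y \in \omega_x$, $\exists w_x \in \omega_x$,
$w_x \notin \omega_y$ such that

    $t_x \cap t_y = t_y$ and $t_x \cup t_y = t_x$    (9)

and

    $\omega_x \cap \omega_y = \omega_y$ and $\omega_x \cup \omega_y = \omega_x$.    (10)

By Eq. (6), it is clear that $\Omega_x \ne \Omega_y$ and
$|\Omega_x \cap \Omega_y| = 0$, so we have

    $|\Omega_x \cup \Omega_y| = |\Omega_x| + |\Omega_y|$.    (11)

Let $\Omega_x = \{(t_x, \omega_x)\}$. Based on the meaning of Eq. (9) and
Eq. (10), we have $\Omega_x = \{(t_x, \omega_x)\}
= \{(t_x \cup t_y, \omega_x \cup \omega_y)\}
= \{(t_x, \omega_x) \cup (t_y, \omega_y)\}
= \{(t_x, \omega_x)\} \cup \{(t_y, \omega_y)\} = \Omega_x \cup \Omega_y$.
Therefore, based on Eq. (11), Eq. (7) in Problem 1 becomes
$|\Omega_x| = |\Omega_x| + |\Omega_y|$. ■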
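
The word-level identities of Eq. (9) can be checked with Python sets. This is
a minimal sketch that assumes, as the proof does, that every word of the
shorter term also belongs to the longer one; the example terms and the
function names are hypothetical.

```python
def satisfies_hypotheses(t_x, t_y):
    """Hypotheses of Lemma 1: different, overlapping, and |t_y| < |t_x|."""
    return t_x != t_y and bool(t_x & t_y) and len(t_y) < len(t_x)


def is_absorbed(t_x, t_y):
    """Identities of Eq. (9): the intersection is t_y and the union is t_x."""
    return (t_x & t_y) == t_y and (t_x | t_y) == t_x


# Hypothetical terms: every word of t_y also occurs in t_x.
t_x = {"mahyuddin", "khairuddin", "matyuso", "nasution"}
t_y = {"mahyuddin", "nasution"}

hypotheses_hold = satisfies_hypotheses(t_x, t_y)
absorbed = is_absorbed(t_x, t_y)
```

The identities depend on the subset condition stated in the proof, which is
why the sketch tests it separately from the hypotheses of the lemma.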

Proposition 1. Let $t_z, \ldots, t_y, t_x$ be search terms, where
$t_z \ne \ldots \ne t_y \ne t_x$ and $|t_z| < \ldots < |t_y| < |t_x|$; then
$\Omega_x = \Omega_x \cup \Omega_y$ holds recursively, or
$|\Omega_x| = |\Omega_x| + |\Omega_y|$.

Proof. By Lemma 1 and the assumption that
$|t_z| < \ldots < |t_y| < |t_x|$, we obtain
$|t_z| < |t_{z_1}| \Rightarrow |\Omega_{z_1}| = |\Omega_{z_1}| + |\Omega_z|$,
$|t_{z_1}| < |t_{z_2}| \Rightarrow |\Omega_{z_2}| = |\Omega_{z_2}| + |\Omega_{z_1}|$,
$\ldots$, $|t_y| < |t_x| \Rightarrow |\Omega_x| = |\Omega_x| + |\Omega_y|$.
Because of the inter-independence of the queries as in Eq. (6), we obtain
$\Omega_x \cap \Omega_y = \emptyset$, $\ldots$,
$\Omega_{z_1} \cap \Omega_z = \emptyset$, and $\Omega_x \cup \Omega_y$
belongs to $\Omega_x$; then

    $|\Omega_x \cup \Omega_y| = |\Omega_x| + |\Omega_y|$
    $= |\Omega_x| + |\Omega_y \cup \ldots|$
    $= |\Omega_x| + |\Omega_y| + \ldots$
    $= |\Omega_x| + |\Omega_y| + |\ldots \cup \Omega_z|$
    $= |\Omega_x| + |\Omega_y| + \ldots + |\Omega_z|$

or $|\Omega_x| = |\Omega_x| + |\Omega_y|$ is recursive, where
$|\Omega_y| + \ldots + |\Omega_z|$ is a part of $|\Omega_x|$. ■
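
The additive chain in the proof can be illustrated with a sketch that treats
each event as a set of (term, page) pairs and assumes the events are pairwise
disjoint. The labels `t_x`, `t_y`, `t_z` and the page identifiers are
placeholders, and the function names are hypothetical.

```python
from itertools import combinations


def pairwise_disjoint(events):
    """True if no two events share a (term, page) pair."""
    return all(a.isdisjoint(b) for a, b in combinations(events, 2))


def union_cardinality(events):
    """Cardinality of the union of all events."""
    return len(set().union(*events))


# Placeholder events, one per search term.
events = [
    {("t_x", "p1"), ("t_x", "p2"), ("t_x", "p3")},
    {("t_y", "p1"), ("t_y", "p2")},
    {("t_z", "p1")},
]

disjoint = pairwise_disjoint(events)
lhs = union_cardinality(events)
rhs = sum(len(event) for event in events)
```

Because each pair carries its own term, events for different terms stay
disjoint even when they refer to the same pages, which is what allows the
cardinalities to be summed term by term.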

Lemma 2. If $t_y \ne t_z$ and $t_y \cap t_z = \emptyset$, then
$|\Omega_y \cap \Omega_z| = 0$ and
$|\Omega_y \cup \Omega_z| = |\Omega_y| + |\Omega_z|$.
Proof. For all search terms $t_y$ and $t_z$ where $t_y \ne t_z$ and
$t_y \cap t_z = \emptyset$, by Definition 1 and Definition 2 we obtain
$\forall w_y \in t_y$, $w_y \notin t_z$ $\wedge$ $\forall w_z \in t_z$,
$w_z \notin t_y$ $\Rightarrow$ $\forall w_y \in \omega_y$,
$w_y \notin \omega_z$ $\wedge$ $\forall w_z \in \omega_z$,
$w_z \notin \omega_y$ such that

    $t_z \cap t_y = \emptyset \vee t_z \cup t_y = t_y \cup t_z$    (12)

and

    $\omega_y \cap \omega_z = \emptyset \vee \omega_y \cup \omega_z = \omega_z \cup \omega_y$.    (13)

Let $\Omega_y = \{(t_y, \omega_y)\}$ and $\Omega_z = \{(t_z, \omega_z)\}$ be
two independent events from the queries. Based on Eq. (6) we obtain
$\Omega_y \cap \Omega_z = \emptyset$ and

    $|\Omega_y \cap \Omega_z| = 0$,    (14)

and by combining the meaning of (12), (13), (14), and
$\{(t_y, \omega_y)\} \cup \{(t_z, \omega_z)\} = \Omega_y \cup \Omega_z$, we
can conclude that $|\Omega_y \cup \Omega_z| = |\Omega_y| + |\Omega_z|$. ■
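
Lemma 2 is the usual additivity of cardinality for disjoint sets, which the
following sketch checks on two small events. The events are hypothetical and
are written as sets of (term, page) pairs; the function name is an assumption
of this sketch.

```python
def union_is_additive(a, b):
    """True if |a ∪ b| equals |a| + |b|."""
    return len(a | b) == len(a) + len(b)


# Hypothetical events for two terms that share no word.
omega_y = {("t_y", "p1"), ("t_y", "p2")}
omega_z = {("t_z", "p2"), ("t_z", "p3")}

intersection_size = len(omega_y & omega_z)
additive = union_is_additive(omega_y, omega_z)

# Counter-check: overlapping sets are not additive.
overlapping = union_is_additive({1, 2}, {2, 3})
```

Page `p2` appears in both events, yet the pairs differ because the terms
differ, so the intersection of the events is still empty.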

Lemma 2 expresses that Eq. (7) in Problem 1 becomes
$|\Omega_x| \ne |\Omega_x| + |\Omega_y|$ or

    $|\Omega_x \cup \Omega_y| = |\Omega_x| + |\Omega_y|$.

Proposition 2. Let $\Omega_x \cap \Omega_y = \emptyset$ and
$\Omega_a \cap \Omega_b = \emptyset$. If
$|\Omega_x| = |\Omega_x| + |\Omega_a|$ and
$|\Omega_y| = |\Omega_y| + |\Omega_b|$, then
$|\Omega_x \cap \Omega_y| \ge 0$.

Proof. This is a direct consequence of Lemma 1 and Lemma 2. ■

Lemma 3. Let $t_x$ and $t_z$ be search terms. If $t_x \ne t_z$,
$t_x \cap t_z = \emptyset$, and $\omega_x \cap \omega_z \ne \emptyset$, then
$|\Omega_x| = |\Omega_z|$, $\Omega_x, \Omega_z \subseteq \Omega$.

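Proof. For all search terms $t_x$ and $t_z$ where $t_x \ne t_z$,
$t_x \cap t_z = \emptyset$ and $\omega_x \cap \omega_z \ne \emptyset$, by
Definition 1 and Definition 2 we obtain $\forall w_x \in t_x$,
$w_x \notin t_z$, and $\forall w_z \in t_z$, $w_z \notin t_x$; then

    $t_x \cap t_z = \emptyset \vee t_x \cup t_z = t_z \cup t_x$,    (15)

but $\forall w_x \in \omega_x$, $w_x \in \omega_z$ and
$\forall w_z \in \omega_z$, $w_z \in \omega_x$; then

    $\omega_x \cap \omega_z = \omega_x = \omega_z$, $\omega_x \cup \omega_z = \omega_z \cup \omega_x = \omega_x = \omega_z$.    (16)

For $\Omega_x = \{(t_x, \omega_x)\}$ and $\Omega_z = \{(t_z, \omega_z)\}$ we
have $\Omega_x \cap \Omega_z = \{(t_x, \omega_x)\} \cap \{(t_z, \omega_z)\}
= \{(t_x, \omega_z)\} \cap \{(t_z, \omega_z)\}$, and because
$t_z \in \omega_z$ the intersection $\Omega_x \cap \Omega_z$ must be
$\{(t_x, \omega_z)\} \cap \{(t_z, \omega_z)\}
= \{(t_z, \omega_z)\} \cap \{(t_z, \omega_z)\}$, or
$\Omega_x \cap \Omega_z = \Omega_z \cap \Omega_z$, or
$\Omega_x \cap \Omega_z = \Omega_z$. Similarly,
$\Omega_x \cap \Omega_z = \Omega_x$. Thus $|\Omega_x| = |\Omega_z|$. ■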

This lemma explains that Eq. (7) in Problem 1 becomes
$|\Omega_x| = |\Omega_y|$ if and only if $t_x \ne t_y$ but
$t_x, t_y \in \omega_x \wedge \omega_y$. In other words, by combining (15)
and (16), $\Omega_x = \{(t_x, \omega_x)\}
= \{(t_x, \omega_x \cup \omega_y)\}
= \{(t_x, \omega_x) \cup (t_x, \omega_y)\}
= \{(t_y, \omega_x) \cup (t_y, \omega_y)\}
= \{(t_y, \omega_x \cup \omega_y)\} = \{(t_y, \omega_y)\} = \Omega_y$. This
shows that the search terms may be different but come from the same web
pages, and in this case they take the same meaning from the Web.
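
The situation described by Lemma 3 can be sketched with two different one-word
terms that always occur on the same pages. The corpus is invented, and
treating a term as present when all of its words occur on a page is an
assumption of this sketch; the function name is hypothetical.

```python
# Invented pages in which two different names always co-occur.
PAGES = {
    "p1": "mahyuddin works on search engines with nasution",
    "p2": "nasution and mahyuddin wrote about queries",
    "p3": "a page about web indexing",
}


def pages_containing(term_words, pages):
    """Identifiers of the pages on which every word of the term occurs."""
    return {
        page_id
        for page_id, text in pages.items()
        if set(term_words) <= set(text.split())
    }


t_x = ("mahyuddin",)
t_z = ("nasution",)

omega_x = pages_containing(t_x, PAGES)
omega_z = pages_containing(t_z, PAGES)

terms_are_disjoint = set(t_x).isdisjoint(t_z)
same_pages = omega_x == omega_z
equal_cardinality = len(omega_x) == len(omega_z)
```

The two terms share no word, yet they select the same pages, so their hit
counts coincide, which is the sense in which they take the same meaning from
the Web.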
4 Conclusions and Future Work

Studying the properties of the relation between a query and a search engine
gives an understanding of the statistical semantic representation of an
object in literal text. Our near-future work is to derive some properties of
a search engine for doubletons.